---
title: Performance Quick Wins
description: Practical tips to optimize Agno knowledge base performance, improve search quality, and speed up content loading.
---

Most knowledge bases work great with Agno's defaults. But if you're seeing slow searches, memory issues, or poor results, a few strategic changes can make a big difference.

## When to Optimize

Don't prematurely optimize. Focus on performance when you notice:

- **Slow search** - Queries taking more than 2-3 seconds
- **Memory issues** - Out of memory errors during content loading
- **Poor results** - Search returning irrelevant chunks or missing obvious matches
- **Slow loading** - Content processing taking unusually long

If things are working fine, stick with the defaults and focus on building your application.

## The 80/20 of Performance

These five changes give you the biggest performance boost for the least effort:

### 1. Pick the Right Vector Database

Your database choice has the biggest impact on performance at scale:

```python
from agno.vectordb.lancedb import LanceDb
from agno.vectordb.pgvector import PgVector

# Development: Fast, local, zero setup
dev_db = LanceDb(
    table_name="dev_knowledge",
    uri="./local_db"
)

# Production: Scalable, battle-tested
prod_db = PgVector(
    table_name="prod_knowledge",
    db_url="postgresql+psycopg://user:pass@db:5432/knowledge"
)
```

**Guidelines:**
- **LanceDB** for development and testing (no setup required)
- **PgVector** for production (up to 1M documents, or when you need SQL features)
- **Pinecone** for managed services (no ops overhead, auto-scaling)

### 2. Skip Already-Processed Files

The single biggest speed-up for re-running your ingestion:

```python
# Skip files you've already processed
knowledge.add_content(
    path="large_document.pdf",
    skip_if_exists=True,  # Don't reprocess existing files
    upsert=False          # Don't update existing
)

# For batch loading
knowledge.add_contents(
    paths=["docs/", "policies/"],
    skip_if_exists=True,
    include=["*.pdf", "*.md"],
    exclude=["*temp*", "*draft*"]
)
```
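
To see why skipping pays off, here is a minimal sketch of the general idea behind a skip-if-exists check: fingerprint each item and do the expensive work only for fingerprints you have not seen. This is an illustration, not Agno's implementation; `content_hash`, `ingest`, and `seen_hashes` are hypothetical names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a stable fingerprint for a piece of content."""
    return hashlib.sha256(data).hexdigest()


def ingest(items: dict[str, bytes], seen: set[str]) -> list[str]:
    """Process only items whose fingerprint is not already in `seen`.

    Returns the names that were actually processed.
    """
    processed = []
    for name, data in items.items():
        digest = content_hash(data)
        if digest in seen:
            continue  # already ingested: skip the expensive work
        seen.add(digest)
        processed.append(name)  # stand-in for read, chunk, embed, store
    return processed


seen_hashes: set[str] = set()
docs = {"a.md": b"alpha", "b.md": b"beta"}

first_run = ingest(docs, seen_hashes)   # both files are new
second_run = ingest(docs, seen_hashes)  # nothing left to do
print(first_run, second_run)
```

On the second run the loop does no processing at all, which is where the time savings come from when you re-run ingestion over a mostly unchanged corpus.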

### 3. Use Metadata Filters

Narrow searches before vector comparison for faster, more accurate results:

```python
# Slow: Search everything
results = knowledge.search("deployment process", max_results=10)

# Fast: Filter first, then search
results = knowledge.search(
    query="deployment process",
    max_results=10,
    filters={"department": "engineering", "type": "procedure"}
)

# Validate your filters to catch typos
valid_filters, invalid_keys = knowledge.validate_filters({
    "department": "engineering",
    "invalid_key": "value"  # This gets flagged
})
```
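
The speed-up comes from shrinking the candidate set before any similarity math runs. The sketch below shows that idea in plain Python, assuming exact-match filters and cosine similarity; the `chunks` structure and the `filtered_search` helper are hypothetical and not part of Agno.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vec, chunks, filters, max_results=10):
    """Drop chunks that fail the metadata filters, then rank the rest."""
    candidates = [
        c for c in chunks
        if all(c["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_vec, c["vector"]),
        reverse=True,
    )
    return ranked[:max_results]


chunks = [
    {"id": "eng-deploy", "vector": [0.9, 0.1], "meta": {"department": "engineering"}},
    {"id": "hr-onboard", "vector": [0.8, 0.2], "meta": {"department": "hr"}},
    {"id": "eng-oncall", "vector": [0.1, 0.9], "meta": {"department": "engineering"}},
]

hits = filtered_search([1.0, 0.0], chunks, {"department": "engineering"})
hit_ids = [c["id"] for c in hits]
print(hit_ids)
```

The HR chunk is never scored, even though its vector is close to the query. That is both the performance gain and the accuracy gain: off-topic content cannot outrank relevant content if it is filtered out first.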

### 4. Match Chunking Strategy to Your Content

Different strategies have different performance characteristics:

| Strategy | Speed | Quality | Best For |
|----------|-------|---------|----------|
| **Fixed Size** | Fast | Good | Uniform content, when speed matters |
| **Semantic** | Slower | Best | Complex docs, when quality matters |
| **Recursive** | Fast | Good | Structured docs, good balance |

```python
from agno.knowledge.chunking.fixed import FixedSizeChunking
from agno.knowledge.chunking.semantic import SemanticChunking

# Fast processing for simple content
fast_chunking = FixedSizeChunking(
    chunk_size=800,
    overlap=80
)

# Better quality for complex content (but slower)
quality_chunking = SemanticChunking(
    chunk_size=1200,
    similarity_threshold=0.5
)
```
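
If it helps to picture what `chunk_size` and `overlap` control, here is a rough sketch of fixed-size chunking over characters. It assumes character-based sizes and a simple sliding window; Agno's `FixedSizeChunking` may pick boundaries differently, and `fixed_size_chunks` is a hypothetical helper.

```python
def fixed_size_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


small = fixed_size_chunks("abcdefghij", chunk_size=4, overlap=1)
large = fixed_size_chunks("x" * 2000, chunk_size=800, overlap=80)
print(small, len(large))
```

A larger overlap preserves more context across chunk boundaries but produces more chunks to embed and store, so it trades loading speed and index size for retrieval quality.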

Learn more about [choosing chunking strategies](/concepts/knowledge/chunking/overview).

### 5. Use Async for Batch Operations

Process multiple items concurrently:

```python
import asyncio

async def load_knowledge_efficiently():
    # Load multiple content sources in parallel
    tasks = [
        knowledge.add_content_async(path="docs/hr/"),
        knowledge.add_content_async(path="docs/engineering/"),
        knowledge.add_content_async(url="https://company.com/api-docs"),
    ]
    await asyncio.gather(*tasks)

asyncio.run(load_knowledge_efficiently())
```
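
When you load many sources at once, you may want to cap concurrency and keep one failure from cancelling the rest. The sketch below shows that pattern with `asyncio.Semaphore` and `gather(return_exceptions=True)`. `load_source` is a hypothetical stand-in for a call such as `knowledge.add_content_async(...)`, and the `"bad"` prefix only simulates a failing source.

```python
import asyncio


async def load_source(name: str, limiter: asyncio.Semaphore) -> str:
    """Placeholder loader: sleeps briefly, fails for names starting with 'bad'."""
    async with limiter:  # at most `max_concurrent` loads run at once
        await asyncio.sleep(0.01)  # stand-in for network or disk I/O
        if name.startswith("bad"):
            raise ValueError(f"could not load {name}")
        return name


async def load_all(names: list[str], max_concurrent: int = 2):
    limiter = asyncio.Semaphore(max_concurrent)
    # return_exceptions=True collects failures instead of raising on the first
    results = await asyncio.gather(
        *(load_source(n, limiter) for n in names),
        return_exceptions=True,
    )
    loaded = [r for r in results if not isinstance(r, Exception)]
    failed = [n for n, r in zip(names, results) if isinstance(r, Exception)]
    return loaded, failed


loaded, failed = asyncio.run(
    load_all(["docs/hr/", "bad/path/", "docs/engineering/"])
)
print("loaded:", loaded, "failed:", failed)
```

Capping concurrency also helps with embedder rate limits and memory pressure, since only a few sources are in flight at any time.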

## Common Performance Pitfalls

### Issue: Search Returns Irrelevant Results

**What's happening:** Chunks are too large or too small, or the chunking strategy doesn't match your content.

**Quick fixes:**
1. Check your chunking strategy - try semantic chunking for better context
2. Verify content actually loaded: `knowledge.get_content_status(content_id)`
3. Increase `max_results` to see if relevant results are just ranked lower
4. Add metadata filters to narrow the search scope

```python
# Debug search quality
results = knowledge.search("your query", max_results=10)
if not results:
    content_list, count = knowledge.get_content()
    print(f"Total content items: {count}")
    
    # Check for failed content
    for content in content_list[:5]:
        status, message = knowledge.get_content_status(content.id)
        print(f"{content.name}: {status}")
```

### Issue: Content Loading is Slow

**What's happening:** Processing large files without batching, or using semantic chunking on huge datasets.

**Quick fixes:**
1. Use `skip_if_exists=True` to avoid reprocessing
2. Switch to fixed-size chunking for faster processing
3. Process in batches instead of all at once
4. Use file filters to only process what you need

```python
# Batch processing for large datasets
import os

def load_content_in_batches(knowledge, content_dir, batch_size=10):
    # Sort so batches are the same from run to run (os.listdir order is arbitrary)
    files = sorted(f for f in os.listdir(content_dir) if f.endswith('.pdf'))
    
    for i in range(0, len(files), batch_size):
        batch_files = files[i:i+batch_size]
        print(f"Processing batch {i//batch_size + 1}")
        
        for file in batch_files:
            knowledge.add_content(
                path=os.path.join(content_dir, file),
                skip_if_exists=True
            )
```

### Issue: Running Out of Memory

**What's happening:** Loading too many large files at once, or chunk sizes are too large.

**Quick fixes:**
1. Process content in smaller batches (see the batch-loading code under "Content Loading is Slow")
2. Reduce chunk size in your chunking strategy
3. Use `include` and `exclude` patterns to limit what gets processed
4. Clear old/outdated content regularly with `knowledge.remove_content_by_id()`

```python
# Process only what you need
knowledge.add_contents(
    paths=["large_dataset/"],
    include=["*.pdf"],       # Only PDFs
    exclude=["*backup*"],    # Skip backups
    skip_if_exists=True,
    metadata={"batch": "current"}
)
```
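
To preview which files a set of patterns would select before you start a large load, you could run a dry check like the one below. It assumes glob-style matching on file names, where a file must match at least one `include` pattern and no `exclude` pattern. Agno's exact matching rules may differ, and `select_files` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def select_files(names, include=None, exclude=None):
    """Keep names that match any include pattern and no exclude pattern."""
    selected = []
    for name in names:
        if include and not any(fnmatchcase(name, p) for p in include):
            continue  # not in the allow-list
        if exclude and any(fnmatchcase(name, p) for p in exclude):
            continue  # explicitly excluded
        selected.append(name)
    return selected


files = ["report.pdf", "report_backup.pdf", "notes.md", "draft_plan.pdf"]
picked = select_files(files, include=["*.pdf"], exclude=["*backup*"])
print(picked)
```

Checking the selection first is cheap and avoids spending memory and embedding calls on files you never meant to load.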

## Advanced Optimizations

Once you've applied the quick wins above, consider these for further improvements:

### Use Hybrid Search

Combine vector and keyword search for better results:

```python
from agno.vectordb.pgvector import PgVector, SearchType

vector_db = PgVector(
    table_name="knowledge",
    db_url="postgresql+psycopg://user:pass@localhost:5432/db",
    search_type=SearchType.hybrid  # Vector + keyword search
)
```
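
Conceptually, hybrid search merges two rankings into one. The sketch below shows one simple way to do that, a weighted blend of scores, assuming both score sets are already normalized to the 0 to 1 range. It is an illustration only: the database performs its own combination internally, and `blend_scores` and `vector_weight` are hypothetical names.

```python
def blend_scores(vector_scores, keyword_scores, vector_weight=0.5):
    """Rank document ids by a weighted sum of vector and keyword scores."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    blended = {
        doc_id: vector_weight * vector_scores.get(doc_id, 0.0)
        + (1 - vector_weight) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest combined score first
    return sorted(blended, key=blended.get, reverse=True)


vector_scores = {"runbook": 0.9, "faq": 0.6}      # semantic similarity
keyword_scores = {"faq": 1.0, "changelog": 0.5}   # exact-term match strength

ranking = blend_scores(vector_scores, keyword_scores)
print(ranking)
```

Here the FAQ ranks first because it scores reasonably on both signals, while the runbook is only a semantic match. That is the typical benefit of hybrid search for queries containing exact terms such as product names or error codes.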

### Add Reranking

Improve result quality by reranking with Cohere:

```python
from agno.knowledge.reranker.cohere import CohereReranker
from agno.vectordb.pgvector import PgVector

vector_db = PgVector(
    table_name="knowledge",
    db_url="postgresql+psycopg://user:pass@localhost:5432/db",
    reranker=CohereReranker(
        model="rerank-multilingual-v3.0",
        top_n=10
    )
)
```

### Optimize Embedder Dimensions

Reduce dimensions for faster search (with a slight quality trade-off):

```python
from agno.knowledge.embedder.openai import OpenAIEmbedder

# Smaller dimensions = faster search, lower cost
embedder = OpenAIEmbedder(
    id="text-embedding-3-large",
    dimensions=1024  # Instead of full 3072
)
```
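
For a rough sense of the savings, the arithmetic below estimates raw vector storage at both dimension counts. It assumes 4-byte float32 values and a made-up corpus of 100,000 chunks, and it ignores index and metadata overhead, so treat the numbers as a lower bound rather than a measurement.

```python
BYTES_PER_FLOAT32 = 4


def raw_vector_size_mb(num_vectors: int, dimensions: int) -> float:
    """Approximate storage for the raw vectors only, in megabytes."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32 / 1_000_000


num_chunks = 100_000  # hypothetical corpus size
full = raw_vector_size_mb(num_chunks, 3072)
reduced = raw_vector_size_mb(num_chunks, 1024)

print(f"3072 dims: {full:.1f} MB")
print(f"1024 dims: {reduced:.1f} MB")
print(f"ratio: {full / reduced:.1f}x")
```

Under these assumptions, 1024-dimension vectors take one third of the space, and each similarity comparison touches one third as many numbers.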

## Monitoring Performance

Keep an eye on these metrics:

```python
# Check content processing status
content_list, total_count = knowledge.get_content()

failed = [c for c in content_list if c.status == "failed"]
if failed:
    print(f"Failed items: {len(failed)}")
    for content in failed:
        status, message = knowledge.get_content_status(content.id)
        print(f"  {content.name}: {message}")

# Time your searches
import time

start = time.perf_counter()  # monotonic, high-resolution clock for intervals
results = knowledge.search("test query", max_results=5)
elapsed = time.perf_counter() - start
print(f"Search took {elapsed:.2f} seconds")
```
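
A single timing can be noisy, so it may be worth timing a handful of representative queries several times and looking at the median and the worst case. The helper below is a minimal sketch of that. `time_calls` is a hypothetical name, and `fake_search` stands in for a real call such as `knowledge.search(query, max_results=5)`.

```python
import statistics
import time


def time_calls(fn, queries, repeats=3):
    """Call fn(query) `repeats` times per query and summarize the latencies."""
    durations = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            fn(query)
            durations.append(time.perf_counter() - start)
    return {
        "count": len(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_search(query):
    """Placeholder for a real search call."""
    time.sleep(0.001)
    return [query]


stats = time_calls(fake_search, ["deployment process", "vacation policy"])
print(f"{stats['count']} calls, median {stats['median']:.4f}s, max {stats['max']:.4f}s")
```

Comparing the median before and after a change (a new filter, a different chunk size) gives you a steadier signal than a one-off measurement.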

## Next Steps

<CardGroup cols={2}>
  <Card title="Chunking Strategies" icon="scissors" href="/concepts/knowledge/chunking/overview">
    Learn how different chunking strategies affect performance
  </Card>
  <Card title="Vector Databases" icon="database" href="/concepts/vectordb/overview">
    Compare vector database options for your scale
  </Card>
  <Card title="Embedders" icon="vector-square" href="/concepts/knowledge/embedder/overview">
    Choose the right embedder for your use case
  </Card>
  <Card title="Hybrid Search" icon="magnifying-glass" href="/concepts/knowledge/advanced/hybrid-search">
    Combine vector and keyword search for better results
  </Card>
</CardGroup>

<Tip>
**Start simple, optimize when needed.** Agno's defaults work well for most use cases. Profile your application to find actual bottlenecks before spending time on optimization.
</Tip>